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i — i 1 Abstract 

£^ This note extends the analysis of incremental PageRank in [B. Bahmani, A. 

^ Chowdhury, and A. Goel. Fast Incremental and Personalized PageRank. VLDB 

O 2011]. In that work, the authors prove a running time of 0{J^ ln(m)) to keep 

PageRank updated over m edge arrivals in a graph with n nodes when the algo- 
I rithm stores R random walks per node and the PageRank teleport probability 

^ is e. To prove this running time, they assume that edges arrive in a random 

CD order, and leave it to future work to extend their running time guarantees to 

adversarial edge arrival. In this note, we show that the random edge order as- 
sumption is necessary by exhibiting a graph and adversarial edge arrival order 

in which the running time is Q fRn 1+lg ^^~ e 'j . 



2 Introduction 
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. £^ In pp, Bahmani, Chowdhury, and Goel propose a method of keeping an approx- 

imation to PageRank updated as edges from a graph arrive online. They use 
the Monte Carlo method of computing PageRank [2]. In this method, we start 
at each node in the graph and take a random walk. After each step of a random 
walk, we terminate the walk with probability e, the teleport probability, and call 
it complete. With the remaining probability 1 — e the walk is incomplete and 
we continue the walk. If we reach a node u with outdegree the walk stops, 
but in this case the walk will continue if an edge from u arrives. To reduce 
variance, we take R random walks per node, where R might be a constant or 
log(n), depending on the accuracy required. When a new edge [u, v) is added 
to the graph, we consider revising each walk which passed through u, since it 
perhaps should have used this new edge. The probability that the walk should 
have used this new edge is were d(u) is the new outdegree of u. We flip a 
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biased coin for each walk through u, and with probability we throw away 
the remainder of the walk and generate a new remainder starting with v. To 
make sure that the length of each complete walk is geometrically distributed 
with expected length we preserve the length of the original walk when we 
generate a new remainder. 

When the graph is chosen by an adversary but edges arrive in a random 
order, Bahmani, Chowdhury, and Goel [1] prove that the number of walks which 
need to be updated over m edge arrivals is ^ ln(m) where n is the number of 
vertices, R is the number of stored walks per vertex, and e is the teleport 
probability. They state that it would be an interesting result to extend their 
running time guarantees to adversarial edge arrival. This motivates the following 
question: does their algorithm require at most ^ ln(m) segment updates for 
an adversarially chosen edge order? Our contribution is answering this question 
in the negative. 

3 Result and Theory 

Theorem 1. There exists a graph and edge order such that the total number of 
walk segments updated as the edges arrive is 

For example if e — .2 the number of updates is = D, (i?n 126 ). Hence the PageR- 
ank algorithm in J^/ does not run in time in time 0(Rn\og(m)) in the adver- 
sarial graph and adversarial edge model. 

Proof. For any power of two n, we describe how to construct a graph on 2n — 1 
nodes. The case n = 16 is shown in figure [T] with blue labels indicating the 
order of edge arrival. There is a top row of n nodes each connected to the root 
of a balanced binary tree of n — 1 nodes. The top row of edges arrives first, 
creating nR walk segments to the root of the binary tree. The edges in the 
tree arrive in a depth-first traversal of the bpinary tree starting at the root, so 
pagerank is funneled toward a leaf before being diluted among the branches. 
The left edge leaving each vertex arrives before the right edge, so when the left 
edge leaving a vertex u arrives, any incomplete walk through u will need to be 
updated. When the right edge leaving a vertex u arrives, any incomplete walk 
will need to be updated with probability |. Consider the probability that a 
walk from the root needs to be updated when an edge (it, v) arrives to a node 
u in row i. First of all the walk needs to have length at least i which happens 
with probability (1 — e)\ Now as we trace the unique path from the root to 
u, the probability that a walk follows this path is the probability that it takes 
the correct edge (right or left) leaving each vertex. Because the left edge to 
each vertex on any path from the root arrive before the right edge, the walk 
is guarenteed to follow the path to u at left edges, while at right edges it has 
probability | of going towards u. Thus if there are k left edges on the path and 
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Figure 1: The case n = 16 of our graph. Edges are labeled with the order in 
which they arrive. 

i — k right edges, the probability that a walk of length at least i will reach u is 
. Since there are nR paths from top nodes which could potentially reach 
u, the expected number of paths which need to be updaded when edge (u, v) 
arrives is 

n*(l-e)<Q)\ 

In the ith row, there are Q) nodes which can be reached via k right branches 
and i—k left branches from the root. Thus the total expected number of path 
segments updated due to edges in the «th row is 

after applying the binomial theorem. There are lg(n) rows in the tree, so the 
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total number of path segments updated is 




Rn 



and the result follows. □ 
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